
 neural estimator




Towards a pretrained deep learning estimator of the Linfoot informational correlation

arXiv.org Machine Learning

We develop a supervised deep-learning approach to estimate mutual information between two continuous random variables. As labels, we use the Linfoot informational correlation, a transformation of mutual information that has many important properties. Our method is based on ground truth labels for Gaussian and Clayton copulas. We compare our method with estimators based on kernel density, k-nearest neighbours and neural networks, and find that it generally has lower bias and lower variance. Since this work is a proof of principle, future research could look into training the model with a more diverse set of examples from other copulas for which ground truth labels are available.
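For orientation, the Linfoot informational correlation maps a mutual information I (in nats) to the unit interval via r = sqrt(1 - exp(-2 I)). The sketch below is only an illustration and is not the authors' labelling code. It assumes the standard closed form I = -0.5 * log(1 - rho^2) for a bivariate Gaussian with correlation rho, which also holds for a Gaussian copula with parameter rho because mutual information does not depend on the marginals. The function names are hypothetical.

```python
import math


def gaussian_mutual_information(rho: float) -> float:
    """Mutual information (nats) of a bivariate Gaussian with correlation rho.

    Standard closed form, assumed here for illustration: I = -0.5 * log(1 - rho^2).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log1p(-rho * rho)


def linfoot_correlation(mi_nats: float) -> float:
    """Linfoot informational correlation r = sqrt(1 - exp(-2 I)), with I in nats."""
    if mi_nats < 0.0:
        raise ValueError("mutual information cannot be negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi_nats))


if __name__ == "__main__":
    # In the Gaussian case the Linfoot correlation recovers |rho|.
    for rho in (0.0, 0.3, -0.8):
        mi = gaussian_mutual_information(rho)
        print(rho, mi, linfoot_correlation(mi))
```

In the Gaussian case the transformation returns the absolute value of the ordinary correlation coefficient, which is one reason it is a convenient bounded label in place of raw mutual information.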


Beyond Normal: On the Evaluation of Mutual Information Estimators

Neural Information Processing Systems

However, mutual information estimators are typically evaluated on simple families of probability distributions, namely the multivariate normal distribution and selected distributions with one-dimensional random variables.




Theoretical guarantees for neural estimators in parametric statistics

arXiv.org Machine Learning

Neural estimators are simulation-based estimators for the parameters of a family of statistical models, which build a direct mapping from the sample to the parameter vector. They benefit from the versatility of available network architectures and efficient training methods developed in the field of deep learning. Neural estimators are amortized in the sense that, once trained, they can be applied to any new data set with almost no computational cost. While many papers have shown very good performance of these methods in simulation studies and real-world applications, so far no statistical guarantees are available to support these observations theoretically. In this work, we study the risk of neural estimators by decomposing it into several terms that can be analyzed separately. We formulate easy-to-check assumptions ensuring that each term converges to zero, and we verify them for popular applications of neural estimators. Our results provide a general recipe to derive theoretical guarantees also for broader classes of architectures and estimation problems.


Accurate Estimation of Mutual Information in High Dimensional Data

arXiv.org Machine Learning

Mutual information (MI) is a measure of statistical dependencies between two variables, widely used in data analysis. Thus, accurate methods for estimating MI from empirical data are crucial. Such estimation is a hard problem, and there are provably no estimators that are universally good for finite datasets. Common estimators struggle with high-dimensional data, which is a staple of modern experiments. Recently, promising machine learning-based MI estimation methods have emerged. Yet it remains unclear if and when they produce accurate results, depending on dataset sizes, statistical structure of the data, and hyperparameters of the estimators, such as the embedding dimensionality or the duration of training. There are also no accepted tests to signal when the estimators are inaccurate. Here, we systematically explore these gaps. We propose and validate a protocol for MI estimation that includes explicit checks ensuring reliability and statistical consistency. Contrary to accepted wisdom, we demonstrate that reliable MI estimation is achievable even with severely undersampled, high-dimensional datasets, provided these data admit accurate low-dimensional representations. These findings broaden the potential use of machine learning-based MI estimation methods in real-world data analysis and provide new insights into when and why modern high-dimensional, self-supervised algorithms perform effectively.


Review for NeurIPS paper: Graph Cross Networks with Vertex Infomax Pooling

Neural Information Processing Systems

Additional Feedback: This paper proposes GXN for modeling graph data. Overall, the proposed approach is quite intuitive, and extensive experiments on graph classification and vertex classification demonstrate the effectiveness of the approach. I have the following concerns regarding the paper: 1. The relation to existing graph pooling techniques is not clear. One key component of GXN is vertex infomax pooling, which is used to create multi-scale graphs and is thus related to existing graph pooling techniques.


Review for NeurIPS paper: Graph Cross Networks with Vertex Infomax Pooling

Neural Information Processing Systems

The paper makes a novel contribution by introducing graph cross networks, and demonstrates its usefulness in practical examples. While the initial concerns related to the clarity of the paper, the reviewers found that the authors did a good job of summarizing their work and addressed most of these concerns in the rebuttal. The two key components of GXN are a novel vertex infomax pooling, which creates multiscale graphs in a trainable manner, and a novel feature crossing layer, which enables feature interchange across scales. The authors compared their work with prior methods and surpassed all of them, which meets the bar for a NeurIPS presentation. While it does not impact the decision, the following points were left unanswered during the discussion, and it would be great if the authors could take them into account: (1) In VIPool, P_v, P_n, P_{v,n} are all discrete distributions (although the feature vector can be continuous, as there are V nodes, the sample from P_v can only have at most V values, so it is a discrete distribution).